Initial Experiences Re-Exporting Duplicate and Similarity Computation with an OAI-PMH Aggregator

Authors

  • Terry L. Harrison
  • Aravind Elango
  • Johan Bollen
  • Michael L. Nelson
Abstract

The proliferation of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) has resulted in the creation of a large number of service providers, all harvesting from either data providers or aggregators. If data were available regarding the similarity of metadata records, service providers could track redundant records across harvests from multiple sources as well as provide additional end-user services. Because data providers use many metadata formats and diverse mapping strategies, computing similarity requires information retrieval techniques. We describe an OAI-PMH aggregator implementation that uses the optional "about" container to re-export the results of similarity calculations. A total of 3,751 metadata records were harvested from a NASA data provider, and similarities among the records were computed. The results were useful for detecting duplicates, similarities, and metadata errors.
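
The abstract does not say which information retrieval technique was used. As a rough illustration only, the sketch below scores pairs of metadata records with TF-IDF weighting and cosine similarity, one common choice for this kind of task. The function names, the tokenizer, and the 0.8 threshold are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(records):
    # One sparse TF-IDF vector (a dict of term -> weight) per record.
    docs = [Counter(tokenize(r)) for r in records]
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(d.keys())
    # Smoothed IDF, so a term found in every record keeps a nonzero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in d.items()} for d in docs]


def cosine(a, b):
    # Cosine similarity of two sparse vectors; 0.0 if either is empty.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def similar_pairs(records, threshold=0.8):
    # Hypothetical helper: every pair (i, j, score) scoring at or above
    # the threshold. The threshold value is an arbitrary assumption.
    vecs = tfidf_vectors(records)
    pairs = []
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            score = cosine(vecs[i], vecs[j])
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs
```

In a setting like the one described, each record's text might be the concatenation of its title, creator, and description fields, and pairs scoring near 1.0 would be candidate duplicates. The all-pairs loop is quadratic in the number of records, which is workable for a few thousand records but would need indexing for much larger harvests.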

Similar resources

Similarity Computations with an OAI-PMH Aggregator

The proliferation of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) has resulted in the creation of a large number of service providers, all harvesting from either data providers or aggregators. If data were available regarding the similarity of metadata records, service providers could track redundant records across harvests from multiple sources as well as provide addi...

Using OAI-PMH and METS for exporting metadata and digital objects between repositories

Purpose – To examine the relationship between deposit of electronic theses in institutional and archival repositories. Specifically the paper considers the automated export of theses for deposit in the archival repository in continuation of the existing arrangement in Wales for paper-based theses. Design/methodology/approach – The paper presents a description of software that makes use of the O...

REPOX - A Framework for Metadata Interchange

This demonstration presents an XML framework for metadata interchange. REPOX has two goals: to be a means for libraries and other cultural institutions to provide OAI-PMH access to their metadata records, independently of their original format, with a tool that is easy to install, use and deploy; and to be used as an aggregator of OAI-PMH Data Sources. The records are stored internally in XML a...

Collecting metadata from institutional repositories

The purpose of this article is to review metadata issues identified in recent research carried out in Scotland on services based on metadata aggregation via OAI-PMH, and to examine the role of collection-level description in managing ingest to harvested repositories, subsequent harvesting by secondary aggregators, and the contextualisation of institutional and aggregated repositories in the wid...

Interweaving OAI-PMH data sources with the linked data cloud

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) has found wide-spread adoption for exchanging bibliographic metadata. In parallel, the W3C’s Linked Data Initiative exposes and interlinks structured data from a variety of data sources on the Web. Since many of these data sources contain valuable information for institutional repositories (e.g., shared concept definitions,...

Journal title:
  • CoRR

Volume: cs.DL/0401001

Publication date: 2004